index for using different partitioning rules for the data shown in Figure 3.37(a). It can be seen that they indeed reach the optimum (the minimum Gini index and the maximum information gain) when the partitioning rule is y = 2.
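As a brief illustration of this comparison, the following Python sketch evaluates the Gini index and the information gain of candidate partitioning rules. The sample points, labels, and function names are assumptions for illustration only, since the data of Figure 3.37(a) are not reproduced here.

from collections import Counter
from math import log2

def gini(labels):
    # Gini index of a set of class labels: 1 - sum of squared class proportions.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    # Entropy of a set of class labels: -sum of p * log2(p) over the classes.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def evaluate_rule(points, labels, axis, threshold):
    # Weighted Gini index and information gain of the candidate rule
    # "coordinate on the given axis < threshold".
    left = [l for p, l in zip(points, labels) if p[axis] < threshold]
    right = [l for p, l in zip(points, labels) if p[axis] >= threshold]
    n = len(labels)
    weighted_gini = (len(left) * gini(left) + len(right) * gini(right)) / n
    info_gain = entropy(labels) - (len(left) * entropy(left)
                                   + len(right) * entropy(right)) / n
    return weighted_gini, info_gain

# Hypothetical (x, y) points: class 'A' lies below y = 2, class 'B' above it.
pts = [(1, 1.0), (2, 1.5), (3, 0.5), (1, 3.0), (2, 2.5), (3, 3.5)]
lab = ['A', 'A', 'A', 'B', 'B', 'B']
for t in (1.0, 2.0, 3.0):  # candidate rules of the form y < t
    print(t, evaluate_rule(pts, lab, axis=1, threshold=t))

For this sample, the rule y < 2.0 gives the minimum weighted Gini index (zero) and the maximum information gain, mirroring the optimum at y = 2 described above.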
The classification and regression tree algorithm
Creating a classification tree model is based on a series of optimisation processes, each of which is implemented by measuring the value of one of the aforementioned data impurity indices for each partitioning rule. The algorithm can be used for both classification analysis and regression analysis. It works in three steps [Breiman et al., 1984].
In the first step, a tree is grown for a data set based on the recursive partitioning of a data space using a series of partitioning rules. A decision-making node is created in a tree for every partitioning rule. Every node is composed of two parts. One is a selected variable (such as x) and the other is a selected threshold (such as T). The selection of T is based on the minimisation of the Gini index or the maximisation of the entropy index (information gain) among many candidate partitioning rules. A logic operator is used to combine them to generate a logic expression for defining a partitioning rule, such as x < T. A data space, which is either the original space or a previously partitioned subspace, is called a working space for testing a newly generated partitioning rule. Suppose a partitioning rule is defined as a logic expression, i.e., x < T. A working space is divided into two subspaces by testing this logic expression. One of the subspaces is composed of the data points which satisfy this logic expression, i.e., x < T. The other subspace is composed of the rest of the data points; these data points violate this logic expression, i.e., x ≮ T, or equivalently x ≥ T.
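As a compact sketch of this splitting operation, the helpers below (hypothetical names, reusing the gini function from the earlier sketch) divide a working space into the two subspaces by testing the logic expression on every data point, and select the variable and threshold T that minimise the weighted Gini index among candidate rules:

def split_working_space(points, labels, axis, threshold):
    # Divide a working space into two subspaces by testing the logic
    # expression "value on the chosen axis < threshold" (i.e., x < T).
    satisfy, violate = [], []
    for point, label in zip(points, labels):
        if point[axis] < threshold:   # satisfies x < T
            satisfy.append((point, label))
        else:                         # violates it, i.e., x >= T
            violate.append((point, label))
    return satisfy, violate

def best_rule(points, labels):
    # Select the (variable, threshold) pair that minimises the weighted
    # Gini index among candidate rules drawn from the data coordinates.
    candidates = [(axis, p[axis])
                  for p in points for axis in range(len(points[0]))]
    def weighted_gini(rule):
        left, right = split_working_space(points, labels, *rule)
        return (len(left) * gini([l for _, l in left])
                + len(right) * gini([l for _, l in right])) / len(points)
    return min(candidates, key=weighted_gini)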
Data impurity is evaluated for each newly partitioned subspace and hence each node. If the Gini index is zero, no further partition is required from this node. Each node below this node is thus labelled according to the class property of the data points in the corresponding subspace. Such a node with a labelled class is called a leaf in a decision-making tree.
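Putting the first step together, the recursive sketch below (a hypothetical Node class; gini, best_rule, and split_working_space are the illustrative helpers from the earlier sketches, not the original CART implementation) grows a tree until every subspace is pure and labels each pure node as a leaf:

class Node:
    # Either a decision-making node (a variable and a threshold T)
    # or a leaf (a class label).
    def __init__(self, axis=None, threshold=None, left=None, right=None,
                 label=None):
        self.axis, self.threshold = axis, threshold  # partitioning rule x < T
        self.left, self.right = left, right          # nodes below this node
        self.label = label                           # class label of a leaf

def grow(points, labels):
    # Recursively partition a working space; a subspace whose Gini index
    # is zero requires no further partition and becomes a labelled leaf.
    if gini(labels) == 0.0:            # pure subspace: label it as a leaf
        return Node(label=labels[0])
    axis, threshold = best_rule(points, labels)
    left, right = split_working_space(points, labels, axis, threshold)
    if not left or not right:          # degenerate rule: majority-label leaf
        return Node(label=max(set(labels), key=labels.count))
    return Node(axis, threshold,
                left=grow([p for p, _ in left], [l for _, l in left]),
                right=grow([p for p, _ in right], [l for _, l in right]))

Applied to the hypothetical sample used earlier, grow(pts, lab) returns a root decision-making node holding a rule of the form y < T, with two pure leaves below it labelled 'A' and 'B'.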